Genomic adaptation of giant viruses in polar oceans

Despite being perennially frigid, polar oceans form an ecosystem hosting high and unique biodiversity. Various organisms show different adaptive strategies in this habitat, but how viruses adapt to this environment is largely unknown. Viruses of phyla Nucleocytoviricota and Mirusviricota are groups of eukaryote-infecting large and giant DNA viruses with genomes encoding a variety of functions. Here, by leveraging the Global Ocean Eukaryotic Viral database, we investigate the biogeography and functional repertoire of these viruses at a global scale. We first confirm the existence of an ecological barrier that clearly separates polar and nonpolar viral communities, and then demonstrate that temperature drives dramatic changes in the virus–host network at the polar–nonpolar boundary. Ancestral niche reconstruction suggests that adaptation of these viruses to polar conditions has occurred repeatedly over the course of evolution, with polar-adapted viruses in the modern ocean being scattered across their phylogeny. Numerous viral genes are specifically associated with polar adaptation, although most of their homologues are not identified as polar-adaptive genes in eukaryotes. These results suggest that giant viruses adapt to cold environments by changing their functional repertoire, and this viral evolutionary strategy is distinct from the polar adaptation strategy of their hosts.


Cumulative RPKM
2). These viruses, recently identified, do possess the morphogenetic module of this realm, even though a much larger fraction of their informational genes relates them to Nucleocytoviricota 4.
In agreement with previous results based on marker genes 5,6, imiterviruses formed the largest group, with some members distributed widely (>100 samples) but at relatively low abundance (Supplementary Fig. 1b; Supplementary Fig. 2c,d). The cumulative RPKM (i.e., the sum of RPKM across all samples for a given genome) of the second largest group, algaviruses, was significantly higher than that of the other groups (Kruskal-Wallis test, P = 1.47 × 10⁻²¹), but these viruses were observed in fewer samples (≤95 samples) than Imitervirales (Supplementary Fig. 2c,d).
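The two per-genome summaries used above, cumulative RPKM and the number of samples in which a genome is observed, can be illustrated with a minimal sketch. The toy table, the genome and sample names, and the rule that "observed" means a non-zero RPKM are assumptions made for illustration only; they are not taken from the authors' pipeline.

```python
# Hypothetical toy table: genome -> {sample: RPKM}. All values are invented.
rpkm = {
    "genome_A": {"s1": 2.0, "s2": 0.0, "s3": 1.5},
    "genome_B": {"s1": 0.0, "s2": 10.0, "s3": 0.0},
}

def cumulative_rpkm(profile):
    """Sum of the RPKM values of one genome over all samples."""
    return sum(profile.values())

def prevalence(profile):
    """Number of samples in which the genome has a non-zero RPKM (assumed
    definition of 'observed')."""
    return sum(1 for value in profile.values() if value > 0)

# Per-genome summary: (cumulative RPKM, number of samples observed).
summary = {g: (cumulative_rpkm(p), prevalence(p)) for g, p in rpkm.items()}

for genome, (total, n_samples) in summary.items():
    print(genome, total, n_samples)
```

In this toy example genome_B is more abundant in total but seen in fewer samples than genome_A, which is the kind of contrast the text describes between algaviruses and Imitervirales.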

Distribution of giant viruses across size fractions
Most previous metagenomic surveys on giant viruses analysed data from a pico-size fraction (0.2-3.0 µm) 7,8. Our dataset allowed us to assess the abundance of viruses in larger size fractions up to the meso-fraction (200-2000 µm). Giant viruses were detected in both small and large size fractions at many stations (Supplementary Fig. 1a). Accordingly, size index values (i.e., a measure of sampling size fraction preference for each genome) were distributed widely for giant viruses compared with those of individual eukaryotic taxa (Supplementary Fig. 3a,b). Viral signals in large size fractions (e.g., >0.8 µm) may originate from viral genomes inside their host cells, whereas those in small size fractions (e.g., 0.2-3.0 µm) may originate from viral genomes from either free virions or within host cells.
observed. The top five abundant/prevalent genomes are labelled. c, Locally estimated scatterplot smoothing plots of the latitudinal distributions of viral diversity (Shannon's index). The left panel presents the total diversity of all giant viruses along a latitudinal gradient in different size fractions. The Broad and Piconano size fractions were pooled because of their similar relative abundances and lack of Arctic samples in the Piconano size fraction. The right panel shows the diversity of communities of six main groups in the small-size fractions (Pico: 0.2-3 µm, Piconano: 0.8-5 µm, and Broad size: 0.8-2000 µm fractions).

Supplementary Fig. 2 | Summary of the biogeography of viruses. a, Rarefaction curves for marine eukaryotic viral genomes in all samples and b, in subsamples separated by size fractions and biomes. For each curve, the average of 100 permutations is displayed with dots and the standard deviation is displayed with colour ranges. Rarefaction analyses reached a plateau

Supplementary Fig. 4 | Community structures of viruses. a, Venn diagram showing the numbers of shared or unique giant viruses across biomes. b,c, NMDS ordination (Bray-Curtis dissimilarity) of the viral communities for all samples using (b) relative abundance data (stress = 0.1617) and (c) presence/absence data (stress = 0.1541). Colours indicate biomes (left) and size fractions (right). Ellipses represent 95% confidence levels for each group. Statistical significance of the groupings was confirmed for both biome and size fraction groups (ANOSIM, P < 0.01).

Supplementary Fig. 5 | Locally estimated scatterplot smoothing plots of the latitudinal distributions of viral communities. Latitudinal variation in: a, Shannon's index of communities after trimming the lowest 0.5%, 1%, 5%, 10%, and 50% of the total number of non-zero RPKM values; b, Pielou's evenness and c, richness of marine viral communities for different size fractions. Colours indicate size fractions. d, Shannon's index of communities of six main groups in large-size fractions (Nano: 5-20 µm, Micro: 20-200 µm, and Macro: 200-2000 µm size fractions). Shaded areas represent 95% confidence intervals.

Supplementary Fig. 6 | Eukaryotic communities in the virus-host interactome. a,b, Locally estimated scatterplot smoothing plots of the latitudinal distributions of eukaryotic community diversity (Shannon's index). a, Diversity of communities of eukaryotes that have no association with viruses in the network. b, Diversity of communities of eukaryotes that have associations with viruses in the network. Colours indicate size fractions. Polar and equatorial regions are marked with coloured frames. c, Histogram of temperature optima of eukaryotic nodes. Colours represent the phylum of eukaryotic nodes as indicated at the bottom of the plots. Eukaryotic nodes with low temperature optima were enriched in diatoms (Bacillariophyta).
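The community measures named in these captions (richness, Shannon's index, Pielou's evenness and Bray-Curtis dissimilarity) follow standard textbook definitions. The sketch below assumes natural logarithms and plain abundance vectors such as per-genome RPKM values; the function names are illustrative and are not taken from the authors' pipeline.

```python
import math

def richness(abund):
    """Number of taxa with non-zero abundance."""
    return sum(1 for a in abund if a > 0)

def shannon(abund):
    """Shannon's index H = -sum(p_i * ln p_i) over non-zero proportions."""
    total = sum(abund)
    if total == 0:
        return 0.0
    props = [a / total for a in abund if a > 0]
    return -sum(p * math.log(p) for p in props)

def pielou(abund):
    """Pielou's evenness J = H / ln(S). It is undefined for S <= 1;
    returning 0.0 in that case is a convention chosen for this sketch."""
    s = richness(abund)
    return shannon(abund) / math.log(s) if s > 1 else 0.0

def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors of equal
    length: sum|x_i - y_i| / (sum x_i + sum y_i)."""
    denominator = sum(x) + sum(y)
    if denominator == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(x, y)) / denominator

# Invented example communities (abundances of four genomes in two samples).
sample_1 = [4.0, 4.0, 4.0, 4.0]
sample_2 = [16.0, 0.0, 0.0, 0.0]
print(richness(sample_1), shannon(sample_1), pielou(sample_1))
print(bray_curtis(sample_1, sample_2))
```

A perfectly even community reaches the maximum evenness of 1, whereas a community dominated by a single genome has a Shannon's index of 0.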

Host prediction of viruses. a, Phylogenetic tree based on four hallmark genes with putative host groups (coloured circles) predicted by TIM, which are summarized in (b). The outermost layer shows the taxonomy of six main groups. The middle and inner layers show the number of stations in which the viruses were observed and cumulative coverage, respectively. b, Summary of host prediction results. Left panel: line colours represent the six main groups; line widths are proportional to the number of clades predicted to the associated hosts. Right panel: boxplots show the number of viral insertions detected in predicted host genomes. c, ViralRecall scores of 12 Chaetocerotales MAGs and 2 isolates. An example Chaetoceros tenuissimus contig is shown. Viral scores were evaluated with a rolling window of 15 ORFs on a contig. Positive and negative scores represent viral and cellular regions, respectively.
[Panel labels: Chaetocerotales MAGs from TARA ARC 108, TARA MED 95, TARA PSE 93, TARA PSW 86 and TARA SOC 28]

Niche assignments of viral genomes. a, Density plots of the latitude and temperature optima. Three biome assignment groups (Polar, Nonpolar, and Unknown) are displayed separately. b, The Polar clade containing the reference Organic Lake phycodnaviruses (blue circles). Ancestral states of Nonpolar and Polar were estimated using the phylogenetic tree based on a one-parameter equal rates model. Blue indicates Polar and red indicates Nonpolar viruses.

Polar-specific KOs in polar viral genomes. a, Phylogenetic diversity of Polar-specific and Nonpolar-specific KOs. b, Ratio of Polar-specific KOs in Polar, Nonpolar and biome-unknown genomes. c, Boxplots of the temperature optima of Polar KOs enriched in Polar viral genomes in at least one lineage (blue) and other KOs (KOs that were not enriched in Polar viral genomes in any lineage at four taxonomy levels, i.e., root, main group, family, and genus) (red). Enrichment analyses were performed at the main group, family, and genus levels. d, Proportion of alanine (A)-rich low-complexity regions (LCRs) in Polar, Nonpolar and biome-unknown genomes.

six main groups based on our phylogenomic analysis, with genome sizes ranging from 50 kb to 1.6 Mb: Algavirales (n = 155), Asfuvirales (n = 9), Imitervirales (n = 913), Pandoravirales (n = 81), Pimascovirales (n = 75), and Mirusviricota (n = 111). Of these, all but the Mirusviricota belong to the Nucleocytoviricota phylum of the realm Varidnaviria. Mirusviricota belongs to the realm Duplodnaviria (Supplementary Data